Statistical modeling of pronunciation and production variations for speech recognition
نویسندگان
چکیده
In this paper, we propose a procedure for training a pronunciation network with criteria consistent with the optimality objectives for speech recognition systems. In particular, we describe a framework for using maximum likelihood(ML) and minimum classi cation error(MCE) criteria for pronunciation network optimization. The ML criterion is used to obtain an optimal structure for the pronunciation network based on statistically-derived phonological rules. Discrimination among di erent pronunciation networks is achieved by weighting of the pronunciation networks, optimized by applying the MCE criterion. Experinent results demonstrate improvements in speech recognition accuracy after applying statistically derived phonological rules. It is shown that the impact of the pronunciation network weighting on the recognition performance is determined by the size of the recognition vocabulary.
منابع مشابه
Study on Framework for Chinese Pronunciation Variation Modeling
The pronunciation variations, which badly influenced the performance of ASR system, are serious in continuous speech, especially in spontaneous speech. Many research works are focused on pronunciation variation modeling in recent years. A framework for Chinese pronunciation variation modeling is described in this paper. The main idea is that the pronunciation variations are hidden in the recogn...
متن کاملSpeaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech
The presented paper addresses the problem of creating hidden Markov models for fast speech. The major issues discussed are robust parameter estimation and reducing within-model variations. Regarding the first issue, the use of the maximum a posteriori parameter estimation is discussed. To reduce within-model variations, a maximum likelihood based vocal tract length normalization procedure and a...
متن کاملPronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition
In this paper, we describe a pronunciation lexicon model which is especially useful for constructing morpheme-based pronunciation lexicon to improve the performance of a Korean LVCSR. There are a lot of pronunciation variations occurring at morpheme boundaries in continuous speech. For modeling of cross-morpheme pronunciation variations, we usually used a context-dependent multiple pronunciatio...
متن کاملModeling Systematic Variations in Pronunciation
This paper describes the research efforts of the “Hidden Speaking Mode” group participating in the 1996 summer workshop on speech recognition. The goal of this project is to model pronunciation variations that occur in conversational speech in general and, more specifically, to investigate the use of a hidden speaking mode to represent systematic variations that are correlated with the word seq...
متن کاملModeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode
This paper describes the research efforts of the “Hidden Speaking Mode” group participating in the 1996 summer workshop on speech recognition. The goal of this project is to model pronunciation variations that occur in conversational speech in general and, more specifically, to investigate the use of a hidden speaking mode to represent systematic variations that are correlated with the word seq...
متن کامل